Statistical Model Checking (SMC) is a trade-off between testing and formal verification. The core
idea of the approach is to conduct some simulations of the system and verify whether they satisfy a
given property. In this paper we show that SMC is easily parallelizable on a master/slaves architecture
by introducing a series of algorithms that scale almost linearly with respect to the number of slave
computers. Our approach has been implemented in the UPPAAL SMC toolset and applied to non-trivial
case studies.

1 Introduction 

Computers play a central role in modern societies and their errors can have dramatic consequences.
For example, such errors could jeopardize a banking system, possibly stalling the economy of a whole
country or, more dramatically, endanger human life through the failure of safety-critical systems
(railway signaling, integrated avionics, air-traffic, medical life support machines, automotive
electronics). It is therefore not surprising that proving the correctness of computer systems is a highly
relevant problem. Unfortunately, the growing complexity in system design makes it almost impossible to
ensure correctness simply by looking at the (possibly distributed) code. Automatic techniques are thus needed.

The most common method to ensure the correctness of a system is testing (see [3] for a survey). After
the computer system is constructed, it is tested using a number of test cases with predicted outcomes.
Testing techniques have proven effective for bug hunting in many industrial problems. Unfortunately,
testing is not always the perfect solution. Indeed, since there is, in general, no way for a finite set of
test cases to cover all possible scenarios, errors may remain undetected. There are also methods that can
ensure the full correctness of a system. Those methods, also called formal methods, use mathematical
techniques to check whether the system will behave correctly for all possible scenarios. In the past,
formal methods such as symbolic model checking [14] have been used to verify systems with more than
$10^{20}$ reachable states [4].

In an ideal world, it would thus be "better" to use formal methods rather than testing. Unfortunately,
improvements in formal methods do not seem to keep pace with the increasing complexity
of system design. Nowadays, most formal methods suffer from the so-called state-space explosion
problem, which makes them inapplicable to large industrial case studies. As testing does not suffer
from the same problem, it remains the only scalable technique and it is thus the one promoted by
industry.

As already noted, the major drawback of testing is that, in general, it does not give any confidence
in the correctness of the entire system. This lack of accuracy has motivated the development of new
algorithms that combine testing techniques with statistical algorithms. These techniques, also called
Statistical Model Checking (SMC) techniques [11, 15, 20], can be seen as a trade-off between testing
and formal verification. The core idea of the approach is to conduct some simulations of the system
and verify whether they satisfy a given property. The results are then used together with algorithms from
statistics in order to decide whether the system satisfies the property with some probability.
Statistical model checking techniques can also be used to estimate the probability that a system satisfies
a given property [10, 11]. Of course, in contrast to an exhaustive approach, a simulation-based solution
does not guarantee a correct result with 100% confidence. However, it is possible to bound the probability
of making an error. Simulation-based methods are known to be far less memory and time intensive than
exhaustive ones, and are sometimes the only option [12, 22]. Among existing SMC algorithms, one
finds those that use a fixed number of samples (those that estimate the probability) and those that support
sequential sampling (those that test the probability against an estimate provided by the user), where the
number of simulations is not known in advance [11].

Statistical model checking is becoming widely accepted in various research areas such as software
engineering, in particular for industrial applications, and even for solving problems originating from systems
biology [?, 13]. There are several reasons for this success. First, SMC is very simple to understand,
implement, and use. Second, it does not require extra modeling or specification effort, but simply a
stochastic operational semantics of the model that can be used as the basis for simulation and checked
against state-based properties. Third, it allows verifying properties [5, 13] that cannot be expressed in
classical temporal logics.

However, SMC is not a panacea and many very large problems are still beyond its scope. Indeed,
the algorithm sometimes needs a large number of simulations, and each simulation
may be time consuming to compute. There are two solutions to this problem. The first is to propose
new algorithms and heuristics that reduce the number of simulations needed for the algorithm to reach a
decision. The second consists in taking new and emerging platforms into account. This paper
follows the second approach. A natural way to reduce computation time, and hence to improve the
efficiency of SMC, is to use several computers working in parallel. In fact, it is well known that
statistical methods that use samples of independent observations are often trivially parallelizable
(see the work of Metropolis and Ulam). As observed by Younes, SMC algorithms can be distributed with
the help of a master/slave architecture where multiple computers are used to generate the simulations.
The idea is as follows: one or more slave processes register their ability to generate simulations with
a single master process that collects those simulations and performs the statistical test. As pointed
out by Younes [21], in order to ensure that simulations are independent, some care needs to be taken when
generating pseudorandom numbers on each machine; this can easily be solved by incorporating the number
of each processor in the generation of these numbers [21]. When using sequential testing, the situation
becomes more complex, as it is important to guarantee that the technique will not introduce a bias against
simulations that take longer to generate. This can be done by fixing a priori the order in
which simulations are taken into account. Defining this order so that the algorithm scales linearly
with the number of slave processors may be complex and remains a major challenge in distributing
sequential algorithms.

In this paper, we report on the implementation of a new methodology that we use to parallelize the
statistical model checking algorithms we developed for model checking stochastic timed automata [7, 8]
against weighted temporal logic properties. Those SMC algorithms, which have been implemented in
Uppaal-SMC, an SMC extension of the Uppaal toolset [16], rely on Wald's sequential hypothesis
testing (used to test a probability) and Monte Carlo simulation (used to estimate a probability). Our
approach, also implemented in Uppaal-SMC, scales better than that of Younes. Moreover, we show
how to perform parameter estimation with SMC. The latter approach can be used to optimize a given
algorithm (what is the best network topology, the best transmission rate, ...) in an efficient manner. Our
approach is applied to non-trivial case studies.

2 Statistical Model Checking 
2.1 The model 

In this section, we briefly recap the concept of Priced Timed Automata (PTA); see [13] for more details.
We denote by $\mathcal{B}(X)$ the set of finite conjunctions of bounds of the form $x \sim n$ where $x \in X$, $n \in \mathbb{N}$, and
$\sim \in \{<, \le, >, \ge\}$. A Priced Timed Automaton (PTA) is a tuple $\mathcal{A} = (L, \ell_0, X, E, R, I)$ where: (i) $L$ is a
finite set of locations, (ii) $\ell_0 \in L$ is the initial location, (iii) $X$ is a finite set of real-valued variables called
clocks, (iv) $E \subseteq L \times \mathcal{B}(X) \times 2^X \times L$ is a finite set of edges, (v) $R : L \to \mathbb{Z}_{\ge 0}^{X}$ assigns a rate vector to each
location, and (vi) $I : L \to \mathcal{B}(X)$ assigns an invariant to each location. A state of a PTA is a pair $(\ell, v)$ that
consists of a location $\ell$ and a valuation of clocks $v : X \to \mathbb{R}_{\ge 0}$. From a state $(\ell, v) \in L \times \mathbb{R}_{\ge 0}^{X}$ a PTA can
either let time progress or do a discrete transition and reach a new location. During a time delay, clocks
grow with the rates defined by $R(\ell)$, and the resulting clock valuation should satisfy the invariant $I(\ell)$. A
discrete transition from $(\ell, v)$ to $(\ell', v')$ is possible if there is $(\ell, g, Y, \ell') \in E$ such that $v$ satisfies $g$ and $v'$ is
obtained from $v$ by resetting the clocks from the set $Y$ to 0. A run of a PTA is a sequence of alternating time
and discrete transitions.
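To make the two kinds of transitions concrete, here is a minimal Python sketch. It is an illustration only and not the Uppaal-SMC data structures: the names `Edge`, `delay`, and `fire` are hypothetical, guards and invariants are modelled as plain Python predicates rather than conjunctions of clock bounds, and the locations and constants in the usage lines are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

Valuation = Dict[str, float]  # clock name -> current value


@dataclass(frozen=True)
class Edge:
    source: str
    guard: Callable[[Valuation], bool]  # must hold before the edge is taken
    resets: FrozenSet[str]              # clocks set back to 0 by the edge
    target: str


def delay(loc, v, d, rates, invariant):
    """Time transition: each clock x grows by rates[loc][x] * d."""
    v2 = {x: val + rates[loc][x] * d for x, val in v.items()}
    if not invariant[loc](v2):
        raise ValueError("delay would violate the invariant of " + loc)
    return loc, v2


def fire(loc, v, edge):
    """Discrete transition: check the guard, reset clocks, change location."""
    if edge.source != loc or not edge.guard(v):
        raise ValueError("edge is not enabled")
    v2 = {x: (0.0 if x in edge.resets else val) for x, val in v.items()}
    return edge.target, v2


# Hypothetical two-location fragment: wait 10 time units, then move on.
rates = {"Appr": {"x": 1, "c": 1}}
invariant = {"Appr": lambda v: v["x"] <= 20}
edge = Edge("Appr", lambda v: v["x"] >= 10, frozenset({"x"}), "Cross")

loc, v = delay("Appr", {"x": 0.0, "c": 0.0}, 10.0, rates, invariant)
loc, v = fire(loc, v, edge)  # x is reset, the observer clock c keeps its value
```

A run is then simply the list of states produced by alternating `delay` and `fire`.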

Several PTAs $M_1, M_2, \ldots, M_n$ can be put in parallel via message passing in order to form a network
$M_1 \| M_2 \| \cdots \| M_n$ of PTAs. By message passing, we mean that the automata can synchronize on some
transitions and exchange messages through input and output actions.

In order to perform SMC on PTAs, we have to equip them with a stochastic semantics, which is
needed to define a probability space over the sets of their executions. Giving details on the
stochastic semantics of PTAs is beyond the scope of this paper, but details are available in [7]. Roughly
speaking, the stochastic semantics associates probability distributions with both the delays one can spend
in a given state and the transitions between states. In general one considers a uniform distribution
for bounded delays and an exponential distribution for the case where a component can remain indefinitely
in a state. As observed in [?], though the stochastic semantics of each individual PTA is rather simple (but quite
realistic), arbitrarily complex stochastic behavior can be obtained by their composition when mixing
individual distributions through message passing. The beauty of our model is that these distributions are
naturally and automatically defined by the network of PTAs.
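The delay part of this semantics can be illustrated with a small Python sketch. It only mirrors the informal rule stated above (uniform for bounded delays, exponential otherwise); the function name `sample_delay` and its parameters are assumptions made for this illustration, and the full semantics in [7] also covers the choice between competing components and transitions, which is not shown.

```python
import random


def sample_delay(rng, lower, upper=None, rate=1.0):
    """Pick a delay for one component (illustrative, not the tool's code).

    Bounded case: uniform on [lower, upper].
    Unbounded case: lower plus an exponentially distributed delay.
    """
    if upper is not None:
        return rng.uniform(lower, upper)
    return lower + rng.expovariate(rate)


rng = random.Random(42)                       # fixed seed for reproducibility
bounded = sample_delay(rng, 0.0, 10.0)        # invariant bounds the stay
unbounded = sample_delay(rng, 0.0, rate=0.5)  # component may stay forever
```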

Our implementation supports extensions of PTA coming from the language of the Uppaal model
checker [16]. Such models can contain integer variables that can appear in transition guards and
can be updated only when a discrete transition is taken. Additionally, we support other features of the
Uppaal model checker's input language such as data structures and user-defined functions.

A parametrized PTA M(p) is a PTA in which some integer constant (transition weight or constant in 
variable assignment/clock invariant) is replaced by a parameter p. 

For defining properties we use the weighted temporal logic PWCTL, which contains formulas of the
form $\Diamond_{c \le C}\, \varphi$. Here $c$ is an observer clock (that is never reset and should grow to infinity on any infinite
run of the PTA), $C \in \mathbb{R}_{\ge 0}$ and $\varphi$ is a state predicate. We say that a run $\pi$ satisfies $\psi = \Diamond_{c \le C}\, \varphi$ if there exists
$(\ell, v) \in \pi$ such that $\ell$ satisfies $\varphi$ and $v(c) \le C$. We denote by $\Pr_{\mathcal{A}}[\psi]$ the probability that a random run of
the model $\mathcal{A}$ satisfies $\psi$.
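Checking one run against such a bounded property is a simple scan over its states. The following Python sketch assumes a run is stored as a list of (location, valuation) pairs; the name `satisfies` and the example run are hypothetical.

```python
def satisfies(run, phi, clock, bound):
    """True if some state of the run satisfies phi while clock <= bound."""
    return any(phi(loc) and v[clock] <= bound for loc, v in run)


# A made-up run with an observer clock named "time".
run = [
    ("Safe", {"time": 0.0}),
    ("Appr", {"time": 12.5}),
    ("Cross", {"time": 22.5}),
]


def in_cross(loc):
    return loc == "Cross"


within_50 = satisfies(run, in_cross, "time", 50)
within_20 = satisfies(run, in_cross, "time", 20)
```

Because the observer clock only grows, a simulator can stop generating a run as soon as the clock exceeds the bound, so every run is finite.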

2.2 Statistical Model Checking for NPTAs 

The problem of checking $\Pr_{\mathcal{A}}[\Diamond_{c \le C}\, \varphi] \ge \theta$ ($\mathcal{A}$ being a PTA and $\theta \in [0, 1]$) is unfortunately undecidable
in general [?]. Our solution is to approximate the answer using simulation-based algorithms known as
statistical model checking algorithms. We briefly recap the statistical algorithms that
answer the following two types of questions:

1. Testing: Is the probability $\Pr_{\mathcal{A}}[\Diamond_{c \le C}\, \varphi]$ for a given NPTA $\mathcal{A}$ greater than or equal to a certain
threshold $\theta$?

2. Estimation: What is the probability $\Pr_{\mathcal{A}}[\Diamond_{c \le C}\, \varphi]$ for a given NPTA $\mathcal{A}$?

From a conceptual point of view, solving the two questions above via SMC is simple. First, each
run of the system is encoded as a Bernoulli random variable that is true if the run satisfies the property
and false otherwise. Then a statistical algorithm groups the observations to answer the two questions. For
the qualitative question, we use sequential hypothesis testing, while for the quantitative question we
use an estimation algorithm that resembles classical Monte Carlo simulation. The two solutions
are detailed hereafter.

Sequential Sampling/Testing This approach reduces the qualitative question to testing the hypothesis
$H : p = \Pr_{\mathcal{A}}[\Diamond_{c \le C}\, \varphi] \ge \theta$ against $K : p < \theta$. To bound the probability of making errors, we use strength
parameters $\alpha$ and $\beta$ and we test the hypothesis $H_0 : p \ge p_0$ against $H_1 : p \le p_1$ with $p_0 = \theta + \delta_0$ and
$p_1 = \theta - \delta_1$. The interval $[p_1, p_0]$ defines an indifference region, and $p_0$ and $p_1$ are used as thresholds
in the algorithm. The parameter $\alpha$ is the probability of accepting $H_0$ when $H_1$ holds (false positives) and
the parameter $\beta$ is the probability of accepting $H_1$ when $H_0$ holds (false negatives). The above test can
be solved by using Wald's sequential hypothesis testing [17]. This test computes a proportion $r$ among
those runs that satisfy the property. With probability 1, the value of the proportion will eventually cross
$\log(\beta/(1-\alpha))$ or $\log((1-\beta)/\alpha)$ and one of the two hypotheses will be selected.
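As an illustration of the sequential test, the sketch below implements a textbook form of Wald's sequential probability ratio test for Bernoulli samples. It is not the Uppaal-SMC code. The orientation of the log-likelihood ratio (hypothesis $H_0$ over $H_1$) is an assumption chosen so that the two thresholds quoted above match the stated roles of $\alpha$ and $\beta$; the function name `sprt` and the `max_runs` safeguard are hypothetical.

```python
import math


def sprt(sample, theta, delta0, delta1, alpha, beta, max_runs=1_000_000):
    """Sequentially test H0: p >= theta + delta0 against H1: p <= theta - delta1.

    `sample()` returns True when a fresh simulation satisfies the property.
    Requires 0 < theta - delta1 < theta + delta0 < 1.
    Returns the accepted hypothesis and the number of runs consumed.
    """
    p0, p1 = theta + delta0, theta - delta1
    accept_h0 = math.log((1 - beta) / alpha)  # upper threshold
    accept_h1 = math.log(beta / (1 - alpha))  # lower threshold
    log_ratio = 0.0                           # log of likelihood(H0)/likelihood(H1)
    for n in range(1, max_runs + 1):
        if sample():
            log_ratio += math.log(p0 / p1)
        else:
            log_ratio += math.log((1 - p0) / (1 - p1))
        if log_ratio >= accept_h0:
            return "H0", n
        if log_ratio <= accept_h1:
            return "H1", n
    return "undecided", max_runs


# Degenerate samplers make the behaviour easy to follow by hand.
always = sprt(lambda: True, 0.5, 0.1, 0.1, 0.05, 0.05)
never = sprt(lambda: False, 0.5, 0.1, 0.1, 0.05, 0.05)
```

With these parameters each satisfying run adds $\log 1.5 \approx 0.405$ to the ratio and the upper threshold is $\log 19 \approx 2.944$, so eight consecutive satisfying runs are enough to accept $H_0$.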

Estimation This algorithm [11] computes the number of runs needed in order to produce an approximation
interval $[p - \varepsilon, p + \varepsilon]$ for $p = \Pr(\psi)$ with a confidence $1 - \alpha$. The values of $\varepsilon$ and $\alpha$ are chosen
by the user and the number of runs relies on the Chernoff-Hoeffding bound.
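For orientation, a commonly used form of the Chernoff-Hoeffding bound gives $N \ge \ln(2/\alpha) / (2\varepsilon^2)$ runs. The text does not state which exact variant the tool uses, so the constant in the Python sketch below is an assumption, and `runs_needed` is a hypothetical helper name.

```python
import math


def runs_needed(epsilon, alpha):
    """Number of runs so that the estimate is within +/- epsilon of p
    with confidence 1 - alpha, using N >= ln(2/alpha) / (2 * epsilon^2)."""
    return math.ceil(math.log(2 / alpha) / (2 * epsilon ** 2))


n_low_confidence = runs_needed(0.05, 0.05)
```

For $\varepsilon = 0.05$ and $\alpha = 0.05$ this gives $\ln(40)/0.005 \approx 737.8$, i.e. 738 runs. The count depends neither on the model nor on the unknown probability, which is what makes the estimation algorithm easy to distribute.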

3 Distributed Statistical Model-Checking 

We report on preliminary results on using distributed computing to speed up SMC algorithms. We start
by discussing the solution for hypothesis testing, where the number of simulations needed by the test is not
known in advance. A naive way of distributing the generation of the runs may give rise to a bias in the
result, as pointed out by Younes [20]. In short, some computers may generate (for example) positive answers
more quickly than other computers, which may bias the decision toward the positive answer. This
would not happen when computing runs sequentially. In general, the time required to generate runs may
not be uniform and can cause this type of bias. To counter this, Younes [20] proposed a round-robin
solution where the runs are counted in rounds. To improve performance, Younes defined safe lower and
upper bounds on the Binomial random variable that represents the sum of all the positive realisations,
i.e., all the simulations that satisfy the property. Instead of waiting for the results of all the nodes, if a
result is missing the lower and upper bounds are used to take a safe decision. This has the potential to
reduce the execution time since decisions may be taken earlier.

We generalize Younes' algorithm by sending the results of simulations in batches and by
implementing a buffer of incoming results. The batch is used to reduce communication by sending an
aggregate result of predefined size (instead of individual results). The buffer is used to improve
concurrency since the nodes are more loosely synchronized. We experiment on these two dimensions for
different topologies. Younes' algorithm is the particular case where both are equal to one, which
is not very scalable since it generates a lot of traffic and keeps the nodes tightly synchronized. Figure 1
shows the time it took to verify that the mutual exclusion property of the train-gate example distributed
with Uppaal holds with probability 98%, configured with 20 trains and 99.999% confidence. We show
the results for different topologies of our cluster, NxPxC, where N is the number of nodes, P the number
of processors per node, and C the number of cores per processor.




Figure 1 : Time to check for mutual exclusion for 20 trains qualitatively. 

We see, modulo experimental variation (see footnote 2), that the algorithm improves when the batches or buffer are
increased but then it quickly becomes insensitive to these parameters.
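The unbiased counting order can be sketched as follows. This is a simplified Python illustration under stated assumptions and not the implemented algorithm: the class name `RoundRobinCollector` is hypothetical, the per-worker buffers are unbounded here, and the safe lower and upper bounds as well as the sequential test itself are left out. The point is only that batches are counted in a fixed cyclic order, so a fast worker cannot contribute more than a slow one.

```python
from collections import deque


class RoundRobinCollector:
    """Master-side bookkeeping: count batches strictly in worker order."""

    def __init__(self, workers):
        self.buffers = [deque() for _ in range(workers)]  # one FIFO per worker
        self.turn = 0        # worker whose batch must be counted next
        self.runs = 0        # simulations counted so far
        self.successes = 0   # counted simulations that satisfied the property

    def receive(self, worker, batch_size, batch_successes):
        """Store an incoming aggregate result, then count what is allowed."""
        self.buffers[worker].append((batch_size, batch_successes))
        self._drain()

    def _drain(self):
        # Stop as soon as the worker whose turn it is has nothing buffered.
        while self.buffers[self.turn]:
            size, ok = self.buffers[self.turn].popleft()
            self.runs += size
            self.successes += ok
            self.turn = (self.turn + 1) % len(self.buffers)


collector = RoundRobinCollector(workers=2)
for _ in range(3):                 # worker 1 is fast and always positive
    collector.receive(1, 10, 10)
counted_early = (collector.runs, collector.successes)
collector.receive(0, 10, 2)        # worker 0 finally delivers its first batch
counted_late = (collector.runs, collector.successes)
```

In this toy trace nothing is counted until worker 0 reports, after which exactly one batch from each worker is counted and the two extra batches of worker 1 stay buffered.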

Distributing the estimation algorithm is much simpler. We need a fixed number of runs, determined
by the Chernoff bound [11], to conclude on a probability value with a given confidence level. This is
an embarrassingly parallel problem since we can simply divide the work equally and gather the results
at the end. To compensate for fluctuations in the cluster, we could implement work-stealing but, as our
experiments show, this does not seem to be necessary since the observed performance scales almost
linearly. The loss in efficiency in the latter cases reflects the overhead of starting up all the processors
(around 3-4 seconds), which would be amortized over much longer runs. Table 1 shows running time
and relative efficiency for checking a few quantitative properties on the Firewire and LMAC protocols
(see footnote 3).
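A minimal sketch of this division of work is given below. It runs the "workers" one after another in a single process, so it shows only the partitioning and the final merge, not real parallelism. The seeding scheme (mixing the worker number into the seed), the stand-in Bernoulli simulator with success probability 0.3, and all function names are assumptions made for the illustration.

```python
import random


def split_runs(total_runs, workers):
    """Divide the runs as evenly as possible among the workers."""
    base, extra = divmod(total_runs, workers)
    return [base + (1 if w < extra else 0) for w in range(workers)]


def worker(runs, worker_id, base_seed=12345, p_true=0.3):
    """Stand-in for a simulator: counts satisfying runs among `runs` trials.

    Each worker derives its own generator seed from its number so that
    the streams used by different workers differ.
    """
    rng = random.Random(base_seed * 1_000_003 + worker_id)
    return sum(rng.random() < p_true for _ in range(runs))


def estimate(total_runs, workers):
    """Gather the per-worker counts and return the estimated probability."""
    shares = split_runs(total_runs, workers)
    successes = sum(worker(n, w) for w, n in enumerate(shares))
    return successes / total_runs


shares = split_runs(738, 32)
p_hat = estimate(738, 32)
```

Since the total number of runs is fixed in advance, the order in which results arrive does not matter here, in contrast to the sequential test.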

4 Distributed Parametric Model-Checking: The Principle 

In many practical cases system behaviors depend on the values of a finite set of constant parameters. For 
instance, these parameters can define network topology, or transmission rate of a node. 

23.1s 0.91 0.84 0.84 0.69 0.57 0.95 0.89 0.34

Table 1: Time in seconds and efficiency (italic) for checking quantitative properties on the Firewire and
LMAC models as a function of the number of nodes (N), processors per node (P) and cores per processor
(C).

2 Clusters are shared resources with varying load so results are expected to vary.
3 The model and properties are available at http://people.cs.aau.dk/~adavid/smc/

An interesting question is how the system behavior depends on the values of these
parameters. This may include visualisation of this dependency (drawing plots), optimization/worst case
analysis and determining the correlation between different parameters. Another example that we will
study below is computing Nash Equilibrium in wireless ad-hoc networks, e.g. choosing a network
configuration that is stable with respect to the behavior of selfish nodes.

Let us assume that there is a finite set of parameters, each defined on a finite domain. We model
parameterized systems using Uppaal models in which some integer constants (transition weights or
constants in variable assignments/clock invariants) are parameterized, i.e. they are replaced by special
syntactic constructs that define the sets of possible values. Currently, we support two constructs:

• #range(a, b) defines the set of all integers between a and b,

• #booleanmatrix(N) defines the set of all boolean matrices of size N; this construct can be used
to represent the set of all possible topologies of a network with N nodes.

We developed a framework for solving the "parametric" problems listed above (visualisation,
optimization/worst case analysis, Nash Equilibrium computation). In order to solve all these problems our
implementation performs a series of invocations of Uppaal-SMC for different values of the parameters.
These invocations are independent of each other, thus they can easily be distributed on highly
heterogeneous clusters. Our implementation uses the SLURM batch system [19], or it can submit jobs to the
computational nodes on its own over SSH connections.
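The enumeration of parameter values that drives these invocations can be sketched in Python as below. This is not the framework's code: the helper names are hypothetical, the integer range is assumed to include both end points, and the sketch only builds the list of independent jobs without invoking any tool or batch system.

```python
from itertools import product


def int_range(a, b):
    """All integers between a and b (both end points assumed included)."""
    return list(range(a, b + 1))


def boolean_matrices(n):
    """Every n x n boolean matrix, as a list of rows."""
    for bits in product((False, True), repeat=n * n):
        yield [list(bits[r * n:(r + 1) * n]) for r in range(n)]


def instantiations(domains):
    """One dictionary per combination of parameter values."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical sweep over two parameters; each entry is one independent job.
jobs = list(instantiations({"trains": int_range(2, 5), "rate": int_range(1, 20)}))
matrix_count = sum(1 for _ in boolean_matrices(2))
```

The number of boolean matrices grows as $2^{N^2}$, which is why exhaustive enumeration is only feasible for very small N and sampling or heuristics are needed beyond that.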

5 Distributed Parametric Model Checking: Case-Studies 
5.1 Traingate example 

We consider a model of a railway bridge [18] where several trains are crossing a bridge with one track.
Our Uppaal model is depicted in Fig. 2. Trains start in the initial location Safe, where they are not
approaching. They leave that location and start approaching (going to location Appr) with an arrival
rate given by the expression 1 : #range(1, 20) in the figure. This is a parameter declaration that will
be used to generate models with values 1:1, 1:2, ..., 1:20. This expression (of the form i : j) is
an extension of UPPAAL and defines an exponential distribution with the rate j to pick the delay from.
When a train is approaching, it enters Appr and synchronizes with appr[id]!, so the gate controller
knows that train id is approaching. After 10 time units the train will be crossing (enter location Cross),
unless it is stopped before by the gate controller. This is done with the synchronization stop[id]?, after which
the train goes to the location Stop. From there, it is restarted by the gate controller with the
synchronization go[id]? and after 7 time units it will be crossing. After crossing, trains leave the bridge with
leave[id]!, are safe again, and can decide to approach again.

The gate controller keeps track of stopped trains with a FIFO queue (not depicted here) that we will
not detail. Trains are enqueued and dequeued with the help of the functions shown in the
figure. The gate has two main states, Free and Occ (i.e. occupied), that keep track of the state of the
bridge. If a train is approaching, the controller either stops it if the bridge is occupied or lets it pass
otherwise. When the bridge becomes free (one train leaves), the controller restarts the train at the
front of the queue with go[front()]!.



Figure 2: Uppaal models of a train (left) and a gate controller (right).

Here (qualitative) safety means ensuring that at most one train can be in the crossing at any
time; such a property can be checked using the classical UPPAAL model checker. On top of that,
Uppaal-SMC can also evaluate probabilistic (quantitative) properties. For instance, we can estimate the
probability that the first train will cross the bridge within 50 time units by checking the PWCTL property
$\Diamond_{\mathit{time} \le 50}\,(\mathit{Train}(0).\mathit{Cross})$.

Consider two parameters in our model: the number of trains, and the rate with which these trains are
coming. The rate parameter is on location Safe shown in Fig. 2, and the number of trains is declared
similarly in the System declarations.

Fig. 3 depicts the results of a parameter sweep of this model. The plot shows that when the number
of trains increases, the probability that the first train will cross the bridge within 50 time units decreases.
Indeed, it is more likely that it will be stopped by other trains (there are more) and spend time in the
Stop location. When the arrival rate is decreased, the probability also decreases.

Figure 3: Parametric sweep for the traingate model.

5.2 Nash equilibrium Aloha CSMA/CD protocol

The Aloha protocol [1] is a simple Carrier Sense Multiple Access with Collision Detection (CSMA/CD)
protocol that was used in the first known wireless data network, developed at the University of Hawaii
in 1971. The protocol assumes that there are several nodes that share the same wireless medium. Each
node listens to its own signal during its transmission and checks that the signal is not corrupted by a
simultaneous transmission by another node. In case of collision both nodes stop transmitting immediately
and wait for a random time before they try to transmit again.

The Uppaal model of a single node is given in Fig. 4. We consider unslotted Aloha, where the nodes
are not necessarily synchronized. Additionally, we study the p-persistent variant of Aloha, i.e. a protocol
implementation in which the random delay before retransmission is distributed according to a geometric
distribution. This means that in each time slot a node transmits with probability TransmitProb and
waits for one more slot (and then decides again) with probability 1 - TransmitProb.



Figure 4: Model of Aloha in UPPAAL



In our experiments we assumed that the goal of a node is to transmit a single frame within 50 time
units while limiting its energy consumption to 3. This goal for a node i can be expressed using the following
PWCTL formula:

$$\psi_i = \Diamond_{\mathit{Node}(i).\mathit{time} \le 50}\,(\mathit{Node}(i).\mathit{ns} \ge 1 \wedge \mathit{Node}(i).\mathit{energy} \le 3) \qquad (1)$$

Then the utility function $U_i$ of a node i is equal to the probability that the goal $\psi_i$ is satisfied by a
random run of the system, i.e.:

$$U_i(p_1, p_2, \ldots, p_N) = \Pr[S(p_1, p_2, \ldots, p_N) \models \psi_i] \qquad (2)$$

where $p_j$ is equal to the value of TransmitProb chosen by node j.

We consider the case where there is a master node that knows the network configuration (here the
number of nodes) and broadcasts the value of the TransmitProb parameter to all the nodes. Now, if there are
selfish nodes, they can change their values of TransmitProb to achieve better performance (and other
nodes will suffer from that). Thus, the interesting question is to find the value of TransmitProb that
satisfies Nash Equilibrium (NE). For such a value, it is not profitable for any node to alter its behaviour
to the detriment of other nodes. In our case the network is symmetric, thus we can search for NE from
the point of view of the first node only. In other words, parameter $p$ satisfies NE iff $U_0(p, p, \ldots, p)$ is
at least as large as $U_0(p', p, \ldots, p)$ for any $p'$.



Table 2: Nash equilibrium (NE) and Symmetric optimal (Opt) strategies for Aloha.

Number of nodes                            2      3      4      5       6       7
NE strategy $p_{NE}$                       0.37   0.40   0.35   0.35    0.41    0.42
$U(p_{NE}, p_{NE})$                        0.99   0.98   0.95   0.89    0.75    0.61
Symmetric optimal strategy $p_{opt}$       0.30   0.30   0.26   0.22    0.19    0.15
$U(p_{opt}, p_{opt})$                      0.99   0.98   0.96   0.90    0.87    0.98
Computation time                           2m5s   3m44s  7m62s  15m45s  26m11s  37m55s




Figure 5: Utility function (left) and its diagonal slice (right) for Aloha with 5 nodes.

Fig. 5 depicts the plot of the utility function $U_0(p', p, \ldots, p)$ for the network of 5 nodes for different
values of $p'$ and $p$. Here $p'$ is the value of TransmitProb of a potentially selfish node, and $p$ is the value for
the other nodes. The plot also shows the computed values of the Nash Equilibrium (NE) parameter and the symmetric
optimal (Opt) parameter.

Table 2 contains the values of the Nash Equilibrium found for Aloha with different numbers of nodes.
The experiments were done on an 8-node cluster, where each node has an Intel(R) Core(TM)2 Quad CPU
2.66GHz processor.

5.3 Parameterized Topology for Network Models 

The performance of some network protocols can depend not only on retransmission
parameters, as seen previously, but also on the actual topology of the network. In this section we study
the impact of different topologies on the LMAC protocol.

LMAC is a Lightweight Media Access protocol (studied in [7, 10]) used for scheduling
communication in wireless sensor networks where the topology is determined by the physical location and radio
connectivity of the individual nodes. One of the goals of the LMAC protocol is to minimize the number
of collisions in the network and to reconfigure the network to avoid further collisions. The difficulty
of studying such protocols stems from the fact that the topology is not known in advance and there are
exponentially many topologies (at least $n \cdot 2^{n-1}$ for n nodes with one of them being a gateway), which
makes systematic analysis of large networks impractical. In order to study the robustness of the LMAC
protocol against collisions, we propose to examine hundreds of random topologies and then pick and
focus on the most problematic ones. Listing 1 shows how a topology is declared in the Uppaal model:
a two-dimensional array of boolean constants gives the adjacency matrix of the network graph. The
receivers then use the guard can_hear[receiver][sender] when listening for the broadcast channel
synchronizations.

const int NODES = 10;                                      // number of nodes
typedef int[0, NODES-1] nodeid_t;                          // used to identify node
typedef bool topology_t[nodeid_t][nodeid_t];               // type for topology
const topology_t can_hear = #binarymatrix(NODES, NODES);   // adjacency matrix

Listing 1: Network topology declaration in Uppaal model of LMAC.



In this case we try networks of up to ten nodes and twice as many slots, whereas one slot per node would be
enough to schedule flawless communication if only nodes were perfectly aware of each other's choices.
We used the property $\Pr[\Diamond_{\mathit{time} \le 2000}\,(\mathit{col\_count} > 42)]$, estimating the probability of having more than 42
collisions within 2000 time units, which hints that there are perpetually reoccurring collisions.

The prepared model is then processed by our parametric model-checker, which instantiates the keyword
#binarymatrix with a concrete random matrix and distributes the verification on a cluster of computers,
one instance of the matrix per core. Each verification uses Uppaal-SMC. Using the naive randomization,
a cluster of 32 cores (the same as in Section 5.2) can verify 10000 topologies (see footnote 4) in 6h 50min.
Figure 6 shows the five topologies that yield the highest probabilities. We used low-confidence (95%)
statistical parameters to gain performance, thus the estimated probabilities have a large ±0.05 statistical
error, but the topologies found can be studied further in UPPAAL-SMC.



$p_1 = 0.630$, $p_2 = 0.629$, $p_3 = 0.617$, $p_4 = 0.607$, $p_5 = 0.599$

Figure 6: Highest probabilities found by model checking random topologies of 10 nodes. 

4 We detected 707 duplicates by a post-analysis of the generated instances.

Alternatively we tried generating all graphs of up to 10 nodes which are unlikely to be isomorphic. The
procedure is not guaranteed to cover all non-isomorphic classes (it may miss some), but it is very simple
and can be recursively described as follows:

1. Start with a topology consisting of just one node.

2. Add a new node and consider two new topologies:

(a) Connect the new node to all the old nodes, go to step 2 until enough nodes are added.

(b) Leave the new node completely unconnected, go to step 2 until enough nodes are added.

3. For every node in a topology, make a new topology by marking the node as a gateway.

4. Get rid of the topologies where the gateway is not connected.

Up to step 2 the procedure generates $2^{n-1}$ topologies which are certainly non-isomorphic; steps
3 and 4 then contain basic heuristics for picking a gateway, which may yield some isomorphic graphs due to
symmetric gateways, but the overhead is small.
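The procedure above can be sketched in Python as follows. This is an illustration, not the code used for the experiments: the function names are hypothetical, topologies are stored as symmetric boolean adjacency matrices, and "the gateway is not connected" is interpreted here as "the gateway has no neighbour", which is an assumption.

```python
def base_topologies(n):
    """Steps 1-2: each added node is linked to all older nodes or to none."""
    topologies = [[[False]]]  # step 1: a single node without edges
    for new in range(1, n):
        grown = []
        for adj in topologies:
            for connect in (True, False):
                rows = [row + [connect] for row in adj]  # old nodes gain a column
                rows.append([connect] * new + [False])   # row of the new node
                grown.append(rows)
        topologies = grown
    return topologies


def with_gateways(topologies):
    """Step 3: pair every topology with every possible gateway node."""
    return [(adj, gw) for adj in topologies for gw in range(len(adj))]


def drop_isolated_gateways(candidates):
    """Step 4 (assumed meaning): keep gateways that have at least one neighbour."""
    return [(adj, gw) for adj, gw in candidates if any(adj[gw])]


base = base_topologies(10)
candidates = with_gateways(base)
kept = drop_isolated_gateways(candidates)
```

For 10 nodes the first two steps yield $2^{9} = 512$ adjacency matrices and step 3 yields $10 \cdot 512 = 5120$ gateway choices, before the filtering of step 4.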

Figure 7 shows the 5 cases that achieve the highest probability found by generating 5120 topologies
of up to 10 nodes using our heuristics. The verification took about 3h 30min. The heuristic procedure 
has clear advantages over the randomized one but it is not exhaustive. On the other hand, the randomized 
method has the potential to find any topology but without any guarantee. 




Figure 7: Highest probabilities found by model checking generated topologies of 10 nodes. 



6 Conclusion 

This paper proposes new algorithms to distribute statistical model checking algorithms through a
master/slaves architecture. Our results have been implemented in the UPPAAL SMC toolset. A series of
experiments show that our approach scales better than existing solutions [21].

As future work, we will extend our distributed algorithms to the setting of rare events and
unbounded temporal properties. We shall also implement and distribute Bayesian extensions of the
approach we proposed in [13].